Abstract. Using Stein's method for the Beta distributions and a recent technique
by Goldstein and Reinert of comparing the Stein characterization of the target
distribution with that of the approximating distribution, we prove a rate of
convergence in the classical arcsine law, which states that the distribution of the
relative time spent positive by a symmetric random walk on $\mathbb{Z}$ converges weakly
to the arcsine distribution on [0, 1].



1. Introduction 

Consider a game between two players A and B that consists of consecutive tosses
of a fair coin. Each time the coin shows heads, player A has to pay one dollar to
player B and conversely, each time the coin shows tails, player A obtains one dollar
from player B. If we consider the process that gives for each discrete time n the
current fortune of player A, then by the symmetry of the model one is led to the
conjecture that, for n sufficiently large, the relative amount of time that player A is
in the lead should be roughly one-half. The so-called (first) arcsine law states that
this intuition is entirely wrong. In fact, it is more likely that one of the players will
lead for nearly all of the time.

Let $(S_k)_{k \ge 0}$ be the symmetric random walk on $\mathbb{Z}$, i.e. we have
$S_k := \sum_{j=1}^{k} \varepsilon_j$ ($S_0 = 0$) for i.i.d. random variables
$\varepsilon_1, \varepsilon_2, \dots$ with $P(\varepsilon_1 = 1) = P(\varepsilon_1 = -1) = \frac{1}{2}$.
Letting $X_j := 1_{\{S_{j-1} \ge 0,\, S_j \ge 0\}}$, $T_m := \sum_{j=1}^{2m} X_j$,
$R_m := \frac{1}{2} T_m$ and $W_m := \frac{1}{2m} T_m = \frac{1}{m} R_m$, it is a classical
result, first proven by Paul Lévy for Brownian motion, that as $m \to \infty$ we have

\[
  W_m \xrightarrow{\;\mathcal{D}\;} \nu_{\frac12,\frac12},
\]

where, for $a, b > 0$, $\nu_{a,b}$ denotes the Beta distribution with parameters a and b on
[0, 1], which has density $q_{a,b}(x) := \frac{1}{B(a,b)}\, x^{a-1} (1-x)^{b-1}\, 1_{(0,1)}(x)$ with
$B(a,b)$ denoting the Beta function. Consequently, $\nu := \nu_{\frac12,\frac12}$ is the arcsine
distribution on [0, 1] with density
$q(x) := q_{\frac12,\frac12}(x) = \frac{1}{\pi\sqrt{x(1-x)}}\, 1_{(0,1)}(x)$. A proof of this
well-known theorem can be found for example in [Fel68].
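To make the contrast with the "one-half" intuition concrete, the following minimal Python sketch evaluates the limiting arcsine distribution function, which has the standard closed form $F(x) = \frac{2}{\pi}\arcsin\sqrt{x}$. The function name `arcsine_cdf` and the thresholds 10%, 40%, 60% and 90% are illustrative choices made here, not taken from the paper.

```python
import math

def arcsine_cdf(x: float) -> float:
    """CDF of the arcsine distribution on [0, 1]: F(x) = (2/pi) * arcsin(sqrt(x))."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return 2.0 / math.pi * math.asin(math.sqrt(x))

# Limiting probability that the walk spends at most 10% or at least 90%
# of the time on the nonnegative side (one player leads almost always).
lopsided = arcsine_cdf(0.1) + (1.0 - arcsine_cdf(0.9))

# Limiting probability that the time is split roughly evenly (40% to 60%).
balanced = arcsine_cdf(0.6) - arcsine_cdf(0.4)

print(f"lopsided: {lopsided:.4f}, balanced: {balanced:.4f}")
```

In the limit, the lopsided outcome has probability about 0.41 and the balanced one about 0.13, although both events correspond to intervals of the same total length 0.2.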
Recently, there has been some progress in Stein's method for the family of Beta
distributions by Goldstein and Reinert (see [GR12]) and by the author of the present
article (see [Döb12]). Both of these preprints prove rates of convergence in a Pólya
urn model. In this paper we will use the general results from the preprints [GR12]
and [Döb12], and especially the technique from [GR12] of comparing the Stein
characterization of the target distribution with that of the (discrete) approximating
distribution, to prove bounds on the Wasserstein distance of the distribution
$\mathcal{L}(W_m)$ of $W_m$ to the arcsine distribution $\nu$.



2. Stein's method for the arcsine distribution and for $\mathcal{L}(R_m)$

Stein's method is a very useful tool for proving distributional convergence. Its main
advantage over other techniques is that it automatically yields concrete error bounds
on various distributional distances. Since its introduction in 1972 in the seminal
paper [Ste72] by Charles Stein for the univariate standard normal distribution, there
has been much progress in adapting Stein's idea of linking a characterizing operator
for the target distribution to a differential equation (in the absolutely continuous
case) or to a difference equation (in the discrete case), the Stein equation, to other
distributions, for example the Poisson distribution (see [Che75]), the Gamma
distribution (see [Luk94]), the exponential distribution (see [CFR11] and [PR11]), the
geometric distribution (see [PRR]) and many others. For a general introduction to
Stein's method we refer to the book [CGS11], which emphasizes normal approximation
but also treats approximation by other distributions. Here, we will make use of the
recent development of Stein's method for the Beta distributions (specialized to the
arcsine distribution), which was done independently and with different emphases
in [GR12] and [Döb12]. Furthermore, we will heavily use the approach from [GR12]
for finding various Stein characterizations of a distribution supported on $\mathbb{Z}$. This
topic was also explored in [LS].

We start with Stein's method for the arcsine distribution $\nu$. The following result,
a slight variant of Proposition 3.1 in [Döb12], gives a Stein characterization for $\nu$.



We denote by $\mathcal{K}$ the class of all continuous and piecewise continuously differentiable
functions $f : \mathbb{R} \to \mathbb{R}$ vanishing at infinity with
$\int_{\mathbb{R}} |f'(x)| \sqrt{x(1-x)}\, dx < \infty$.

PROPOSITION 2.1. A real-valued random variable X has the arcsine distribution $\nu$
on [0, 1] if and only if for all functions $f \in \mathcal{K}$ the expected values
$E[X(1-X)f'(X)]$ and $E[(X - \frac{1}{2})f(X)]$ exist and coincide.

According to Stein's idea, by Proposition 2.1, for a given $\nu$-integrable test function
$h : \mathbb{R} \to \mathbb{R}$ one is led to consider the Stein equation

\[
  x(1-x) f'(x) + \Bigl(\frac{1}{2} - x\Bigr) f(x) = h(x) - \nu(h), \tag{1}
\]

where $\nu(h) := \int h \, d\nu$, which is to be solved for the unknown function f of
$x \in \mathbb{R}$ (or, at least, $x \in [0, 1]$). From the theory in Section 3 of [Döb12] we
know that there exists a unique solution $f_h$ to (1) defined on $\mathbb{R}$ which is bounded
on [0, 1]. For $x \in (0, 1)$ it is given by

\[
  f_h(x) = \frac{1}{x(1-x)q(x)} \int_0^x \bigl(h(t) - \nu(h)\bigr) q(t)\, dt
         = \frac{1}{x(1-x)q(x)} \int_x^1 \bigl(\nu(h) - h(t)\bigr) q(t)\, dt . \tag{2}
\]

Furthermore, $f_h$ is continuous at 0 and 1 as long as h is. The following result is a
special case of Proposition 3.7 in [Döb12].

Lemma 2.2. Let $h : \mathbb{R} \to \mathbb{R}$ be Borel-measurable and $\nu$-integrable.

(a) If h is bounded, then $\|f_h\|_\infty \le 2 \|h - \nu(h)\|_\infty$.

(b) If h is Lipschitz, then $\|f_h\|_\infty \le 2 \|h'\|_\infty$ and $\|f_h'\|_\infty \le C_1 \|h'\|_\infty$,
where the finite constant $C_1$ does not depend on h.

(c) If h is twice differentiable with bounded first and second derivative, then
$\|f_h''\|_\infty \le C_2 \bigl(\|h'\|_\infty + \|h''\|_\infty\bigr)$, where the finite constant $C_2$
does not depend on h.

Proof. Noting that $\frac{1}{2}$ is the median for $\nu$ with $q(1/2) = \frac{2}{\pi}$, this follows
immediately from Proposition 3.7 in [Döb12] with $a = b = \frac{1}{2}$. □
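As a small worked illustration of formula (2) and of the bound in Lemma 2.2 (b), consider the test function $h(x) = x$ (an example chosen here for convenience, not one treated in the source). By symmetry $\nu(h) = \frac{1}{2}$, and since $\frac{d}{dt}\bigl[t(1-t)q(t)\bigr] = (\frac{1}{2} - t)\,q(t)$ on (0, 1), the integral in (2) can be evaluated in closed form:

```latex
\[
  \int_0^x \Bigl(t - \frac{1}{2}\Bigr) q(t)\, dt
  = -\Bigl[\, t(1-t)q(t) \,\Bigr]_{t=0}^{t=x}
  = -x(1-x)q(x),
  \qquad\text{hence}\qquad
  f_h(x) = \frac{-x(1-x)q(x)}{x(1-x)q(x)} = -1 , \quad x \in (0,1).
\]
```

Indeed, inserting $f_h \equiv -1$ into (1) gives $x(1-x)\cdot 0 + (\frac{1}{2} - x)(-1) = x - \frac{1}{2} = h(x) - \nu(h)$, and $\|f_h\|_\infty = 1 \le 2 = 2\|h'\|_\infty$, in agreement with part (b).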

Remark 2.3. Note that the bounds on the Stein solutions $f_h$ from Lemmas 3.3-3.5
in [GR12] do not cover the case of the arcsine distribution, since they all impose
the condition $a, b \in [1, \infty)$. It is the restriction to these parameters that allows
the authors to provide explicit constants in place of $C_1$ from Lemma 2.2, which
furthermore yield explicit constants for the rate of convergence in the Pólya urn
model.

Now, we turn to a suitable version of Stein's method for the distribution $\mathcal{L}(R_m)$ of
$R_m$. To do so, we will first give an explicit formula for the probability mass function
p(k) of $R_m$, which is given by a famous theorem by Chung and Feller. First, however,
we will need the following lemma.

LEMMA 2.4. Let m be a positive integer. Then

(a) $X_{2j-1} = X_{2j}$ for $j = 1, \dots, m$.

(b) $T_m$ has values in $2 \cdot \{0, \dots, m\}$ and hence $R_m$ has values in $\{0, \dots, m\}$.

(c) Letting $\tilde{X}_j := 1 - X_j$ we have $\tilde{X}_j = 1_{\{S_{j-1} \le 0,\, S_j \le 0\}}$ and

\[
  (\tilde{X}_1, \dots, \tilde{X}_n) \overset{\mathcal{D}}{=} (X_1, \dots, X_n).
\]

Proof. We prove (a) by induction on j. It is easy to see that $X_1 = X_2$ always holds.
Now let $1 \le j \le m - 1$. Then we have $X_{2j-1} = X_{2j}$ by the induction hypothesis.
Suppose that $X_{2j-1} = X_{2j} = 1$. If $S_{2j} = 0$, then the claim $X_{2j+1} = X_{2j+2}$ follows in
the same manner as $X_1 = X_2$. If $S_{2j} > 0$, then necessarily $S_{2j} \ge 2$, yielding
$S_{2j+1} > 0$ and $S_{2j+2} \ge 0$. Hence, $X_{2j+1} = X_{2j+2} = 1$ in this case. If, contrarily,
$X_{2j-1} = X_{2j} = 0$, the proof is similar.

Assertion (b) follows immediately from (a). The first assertion from (c) is clear
since $S_{j-1}$ and $S_j$ are either both nonnegative or both nonpositive. Now observe that
there is a (measurable) function f such that $(X_1, \dots, X_n) = f(S_1, \dots, S_n)$. Since
$\tilde{X}_j = 1_{\{-S_{j-1} \ge 0,\, -S_j \ge 0\}}$ and
$(S_1, \dots, S_n) \overset{\mathcal{D}}{=} (-S_1, \dots, -S_n)$ by symmetry, we have

\[
  (\tilde{X}_1, \dots, \tilde{X}_n) = f(-S_1, \dots, -S_n)
  \overset{\mathcal{D}}{=} f(S_1, \dots, S_n) = (X_1, \dots, X_n) .
\]

□
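Parts (a) and (b) of Lemma 2.4 are statements about every single path, so for small m they can be confirmed by exhaustive enumeration. The sketch below does this in Python; the helper names `indicators` and `check_lemma_2_4` are hypothetical and introduced only for this illustration.

```python
from itertools import product

def indicators(steps):
    """Return [X_1, ..., X_n] with X_j = 1 if S_{j-1} >= 0 and S_j >= 0, else 0."""
    xs, s_prev = [], 0
    for eps in steps:
        s = s_prev + eps
        xs.append(1 if s_prev >= 0 and s >= 0 else 0)
        s_prev = s
    return xs

def check_lemma_2_4(m):
    """Check parts (a) and (b) on all 2^(2m) paths of length 2m."""
    for steps in product((1, -1), repeat=2 * m):
        xs = indicators(steps)
        # (a): X_{2j-1} == X_{2j} for j = 1, ..., m (the list xs is 0-indexed).
        if any(xs[2 * j] != xs[2 * j + 1] for j in range(m)):
            return False
        # (b): T_m = X_1 + ... + X_{2m} is an even number in {0, 2, ..., 2m}.
        t = sum(xs)
        if t % 2 != 0 or not 0 <= t <= 2 * m:
            return False
    return True

print(all(check_lemma_2_4(m) for m in range(1, 7)))
```

The enumeration is only feasible for small m (here up to 4096 paths), so it complements rather than replaces the induction argument.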

The proof of the following well-known theorem can be found for example in [Fel68].

THEOREM 2.5 (Chung-Feller Theorem). Let m be a positive integer. Then, for each
$0 \le k \le m$ we have

\[
  P(R_m = k) = P(T_m = 2k) = u_{2k}\, u_{2m-2k} ,
\]

where $u_0 := 1$ and $u_{2j} := \binom{2j}{j} 2^{-2j}$ for $j \ge 1$ denotes the probability that the
symmetric random walk returns to zero at time 2j.

Thus, by Theorem 2.5 the probability mass function $p : \mathbb{Z} \to \mathbb{R}$ corresponding to
$R_m$ is given by $p(k) = 0$ for $k \in \mathbb{Z} \setminus \{0, \dots, m\}$ and by

\[
  p(k) = u_{2k}\, u_{2m-2k} = \frac{1}{2^{2m}} \binom{2k}{k} \binom{2m-2k}{m-k} \tag{3}
\]

for $0 \le k \le m$.
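Formula (3) is easy to evaluate exactly. The following Python sketch computes p with rational arithmetic and, for a small m, compares it with the distribution of $R_m$ obtained by enumerating all paths of the walk. The function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb

def chung_feller_pmf(m):
    """Exact p(k) = C(2k, k) * C(2m-2k, m-k) / 2^(2m) for k = 0, ..., m."""
    return [Fraction(comb(2 * k, k) * comb(2 * (m - k), m - k), 4 ** m)
            for k in range(m + 1)]

def brute_force_pmf(m):
    """Distribution of R_m = T_m / 2, found by enumerating all 2^(2m) paths."""
    counts = [0] * (m + 1)
    for steps in product((1, -1), repeat=2 * m):
        s_prev, t = 0, 0
        for eps in steps:
            s = s_prev + eps
            t += 1 if s_prev >= 0 and s >= 0 else 0   # X_j
            s_prev = s
        counts[t // 2] += 1
    return [Fraction(c, 4 ** m) for c in counts]

p = chung_feller_pmf(4)
print(p)
print(sum(p) == 1, p == brute_force_pmf(4))
```

For m = 4 the weights are 35/128, 5/32, 9/64, 5/32, 35/128, which already shows the U-shape characteristic of the arcsine density.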



In the following, we review the recent adaptation of the so-called density approach for
absolutely continuous distributions (see, e.g., [SDHR04], [EL10], [CS11] and [CGS11])
to discrete distributions on the integers, which was done in [GR12] and also in [LS].
For reasons of simplicity we restrict ourselves to the case of finite integer intervals.
A finite integer interval is a set I of the form $I = [a, b] \cap \mathbb{Z}$ for some integers $a < b$.
Given a probability mass function $p : \mathbb{Z} \to \mathbb{R}$ with $p(k) > 0$ for $k \in I$ and $p(k) = 0$
for $k \in \mathbb{Z} \setminus I$, we consider the function $\psi : I \to \mathbb{R}$ given by the formula

\[
  \psi(k) := \frac{\Delta p(k)}{p(k)} = \frac{p(k+1) - p(k)}{p(k)} , \tag{4}
\]

where for a function f on the integers $\Delta f(k) := f(k+1) - f(k)$ denotes the forward
difference operator. Note that by definition always $\psi(b) = -1$ if $I = [a, b] \cap \mathbb{Z}$. For
such a probability mass function p with support a finite integer interval $I = [a, b] \cap \mathbb{Z}$,
let $\mathcal{F}(p)$ denote the class of all real-valued functions f on $\mathbb{Z}$ such that $f(a - 1) = 0$.
The following result is a special case of Proposition 2.1 of [GR12].

PROPOSITION 2.6. Let Z be a $\mathbb{Z}$-valued random variable with probability mass function
p which is supported on the finite integer interval $I = [a, b] \cap \mathbb{Z}$ and is positive
there. Then, a given random variable X with support I has the probability mass
function p if and only if for all $f \in \mathcal{F}(p)$ it holds that

\[
  E\bigl[\Delta f(X-1) + \psi(X) f(X)\bigr] = 0 . \tag{5}
\]

The next result, a version of Corollary 2.1 from [GR12], yields various other Stein
characterizations for the distribution corresponding to p from Proposition 2.6.

COROLLARY 2.7. Let Z be a $\mathbb{Z}$-valued random variable with probability mass function
p which is supported on the finite integer interval $I = [a, b] \cap \mathbb{Z}$ and is positive there.
Let $c : [a-1, b] \cap \mathbb{Z} \to \mathbb{R} \setminus \{0\}$ be an arbitrary function. Then, in order that a given
random variable X with support I has the probability mass function p it is necessary
and sufficient that for all functions $f \in \mathcal{F}(p)$ we have

\[
  E\Bigl[c(X-1)\,\Delta f(X-1) + \bigl[c(X)\psi(X) + \Delta c(X-1)\bigr] f(X)\Bigr] = 0 . \tag{6}
\]



REMARK 2.8. Letting $\gamma(k) := c(k)\psi(k) + \Delta c(k-1)$ we see that c satisfies the
difference equation $\Delta c(k-1) = \gamma(k) - c(k)\psi(k)$. This exactly corresponds to the
differential equation $\eta'(x) = \gamma(x) - \eta(x)\psi(x)$ from Formula (14) in [Döb12], where
$\psi(x) := \frac{p'(x)}{p(x)}$ is the logarithmic derivative of the density p. In [Döb12] it is shown
that this differential equation must hold in order that a given distribution $\mu$ with
density p satisfies the Stein identity $E[\eta(Z) f'(Z) + \gamma(Z) f(Z)] = 0$, where $Z \sim \mu$. So
also in this respect, there is a strong analogy between the absolutely continuous and
the discrete case.

Now, with the abstract results at hand, we return to the concrete distribution of
$R_m$, which has probability mass function p supported on $I := [0, m] \cap \mathbb{Z}$ and given by
(3). Using the relation

\[
  \binom{2j}{j} = \frac{4j-2}{j} \binom{2j-2}{j-1} \qquad \text{for } j \ge 1
\]

it can easily be checked that in this case $\psi$ is given by

\[
  \psi(k) = \frac{2k - m + 1}{(k+1)\bigl(2(m-k) - 1\bigr)} , \qquad 0 \le k \le m . \tag{7}
\]
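The closed form (7) can be cross-checked against the probability mass function (3) with exact rational arithmetic. The sketch below assumes that $\psi$ is the discrete analogue of the logarithmic derivative, $\psi(k) = \Delta p(k)/p(k)$ with $p(m+1) = 0$, which is consistent with the property $\psi(b) = -1$ noted in the text; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def pmf(m):
    """p(k) from formula (3) for k = 0, ..., m, extended by p(m + 1) = 0."""
    vals = [Fraction(comb(2 * k, k) * comb(2 * (m - k), m - k), 4 ** m)
            for k in range(m + 1)]
    return vals + [Fraction(0)]

def psi_from_definition(m, k):
    """psi(k) = (p(k + 1) - p(k)) / p(k), assumed definition of psi."""
    p = pmf(m)
    return (p[k + 1] - p[k]) / p[k]

def psi_closed_form(m, k):
    """Closed form (7): (2k - m + 1) / ((k + 1) * (2(m - k) - 1))."""
    return Fraction(2 * k - m + 1, (k + 1) * (2 * (m - k) - 1))

ok = all(psi_from_definition(m, k) == psi_closed_form(m, k)
         for m in range(1, 30) for k in range(m + 1))
print(ok)
```

Because `Fraction` arithmetic is exact, agreement here is an identity check for the tested range of m, not a floating-point approximation.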

This motivates the definition $c(k) := (k+1)\bigl(2(m-k) - 1\bigr)$ for $k = 0, \dots, m$.
These observations lead to the following lemma.



LEMMA 2.9. Let p be the probability mass function of $R_m$ as given by (3). A random
variable X with support $I = [0, m] \cap \mathbb{Z}$ has probability mass function p if and only if
for all $f \in \mathcal{F}(p)$ it holds that

\[
  E\Bigl[ X\Bigl((m - X) + \frac{1}{2}\Bigr) \Delta f(X-1) + \Bigl(\frac{m}{2} - X\Bigr) f(X) \Bigr] = 0 .
\]
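The "only if" direction of Lemma 2.9 lends itself to a quick exact sanity check: under the probability mass function (3), the expectation must vanish for every f with $f(-1) = 0$. The Python sketch below evaluates it for an arbitrary integer-valued f; the function names, the seed and the choice m = 7 are illustrative assumptions.

```python
from fractions import Fraction
from math import comb
import random

def pmf(m):
    """p(k) from formula (3) for k = 0, ..., m."""
    return [Fraction(comb(2 * k, k) * comb(2 * (m - k), m - k), 4 ** m)
            for k in range(m + 1)]

def stein_expectation(m, f):
    """E[ X(m - X + 1/2) * (f(X) - f(X-1)) + (m/2 - X) * f(X) ] under p.

    f is a dict on {-1, 0, ..., m} with f[-1] == 0.
    """
    total = Fraction(0)
    for k, pk in enumerate(pmf(m)):
        forward_diff = f[k] - f[k - 1]            # Delta f(k - 1)
        total += pk * (k * (m - k + Fraction(1, 2)) * forward_diff
                       + (Fraction(m, 2) - k) * f[k])
    return total

rng = random.Random(0)   # fixed seed: arbitrary but reproducible test function
m = 7
f = {k: Fraction(rng.randint(-9, 9)) for k in range(m + 1)}
f[-1] = Fraction(0)
print(stein_expectation(m, f))
```

The printed value is exactly 0 for any such f, since the sum telescopes once $p(m+1) = 0$ and $f(-1) = 0$ are used.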



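Proof. This follows from Corollary 2.7, since by the definition of c we have

\[
  c(k)\psi(k) + \Delta c(k-1) = m - 2k
\]

and

\[
  c(k-1) = k\bigl(2(m-k) + 1\bigr) = 2k\Bigl(m - k + \frac{1}{2}\Bigr) ,
\]

so that dividing the identity (6) by 2 yields the claim. □



3. A rate of convergence for the arcsine law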



In this section we will use the tools from Section 2 to prove a rate of convergence
in the Wasserstein distance for the arcsine law. Recall that for two distributions $\mu_1$
and $\mu_2$ on $(\mathbb{R}, \mathcal{B})$ whose first moments exist, the Wasserstein distance is given by

\[
  d_W(\mu_1, \mu_2) := \sup_{h \in \mathrm{Lip}(1)} \Bigl| \int h \, d\mu_1 - \int h \, d\mu_2 \Bigr|
\]

and for two real-valued random variables X and Y one defines

\[
  d_W(X, Y) := d_W\bigl(\mathcal{L}(X), \mathcal{L}(Y)\bigr)
  = \sup_{h \in \mathrm{Lip}(1)} \bigl| E[h(X)] - E[h(Y)] \bigr| ,
\]

where Lip(1) denotes the class of all Lipschitz-continuous functions h on $\mathbb{R}$ with
minimal Lipschitz constant $\|h'\|_\infty \le 1$. It is known that on the space of probability
measures with existing first moment, convergence in the Wasserstein distance is
stronger than weak convergence.

THEOREM 3.1. There exists a finite constant $C > 0$ such that for each positive
integer m we have

\[
  d_W\bigl(\mathcal{L}(W_m), \nu\bigr) \le \frac{C}{m} .
\]
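The rate in Theorem 3.1 can be probed numerically. The sketch below is not part of the proof: it relies on the standard one-dimensional identity $d_W(\mu_1, \mu_2) = \int |F_1(x) - F_2(x)|\, dx$ for distribution functions $F_1, F_2$, on the arcsine distribution function $F(x) = \frac{2}{\pi}\arcsin\sqrt{x}$, and on formula (3) for the law of $R_m$. All function names are hypothetical.

```python
import math

def pmf(m):
    """p(k) = C(2k, k) * C(2m-2k, m-k) / 4^m, the law of R_m (formula (3))."""
    return [math.comb(2 * k, k) * math.comb(2 * (m - k), m - k) / 4 ** m
            for k in range(m + 1)]

def cdf_antiderivative(x):
    """G with G'(x) = F(x) = (2/pi) * arcsin(sqrt(x)) on [0, 1]."""
    return (2 / math.pi) * ((x - 0.5) * math.asin(math.sqrt(x))
                            + 0.5 * math.sqrt(x * (1 - x)))

def wasserstein_to_arcsine(m):
    """Integral over [0, 1] of |F_m - F|, where F_m is the CDF of W_m = R_m / m."""
    p = pmf(m)
    G = cdf_antiderivative
    total, level = 0.0, 0.0
    for k in range(m):
        level += p[k]                    # F_m equals `level` on [k/m, (k+1)/m)
        a, b = k / m, (k + 1) / m
        # F is increasing and crosses `level` at sin^2(pi * level / 2);
        # clamp the crossing point to the current interval.
        x0 = min(max(math.sin(math.pi * level / 2) ** 2, a), b)
        total += level * (x0 - a) - (G(x0) - G(a))   # part where F <= level
        total += (G(b) - G(x0)) - level * (b - x0)   # part where F >= level
    return total

for m in (1, 10, 100):
    d = wasserstein_to_arcsine(m)
    print(m, d, m * d)
```

For m = 1 the distance equals $\frac{1}{2} - \frac{1}{\pi}$; the last printed column, $m \cdot d_W$, is the quantity that Theorem 3.1 asserts to be bounded in m.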

Remark 3.2. To the best of my knowledge this is the first result that gives a rate of
convergence of order $m^{-1}$ for the arcsine law. The restriction to even times n = 2m
is immaterial and only for convenience, since the formula from the Chung-Feller
Theorem only holds for these times. Since for odd times n = 2m + 1

\[
  \Bigl| \frac{1}{2m+1} \sum_{j=1}^{2m+1} X_j - W_m \Bigr| \le \frac{1}{2m+1} ,
\]

the same rate of convergence also holds for the whole sequence of positive times
of the random walk.

It may be seen from the proof of Theorem 3.1 that the constant C can be made
explicit in terms of the constant $C_1$ from Lemma 2.2.

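Proof of Theorem 3.1. The proof follows the lines of the proof of Theorem 3.1 from
[GR12] and is included for reasons of completeness. Using the notation from [GR12],
for a function f and $y > 0$ let

\[
  \Delta_y f(x) := f(x + y) - f(x) .
\]

We also write $W := W_m$ and $R := R_m$. Let $h \in \mathrm{Lip}(1)$ be fixed and let $f := f_h$ be
the corresponding solution to the Stein equation (1), given by (2) for $x \in (0, 1)$ but
which we set equal to zero for $x \in \mathbb{R} \setminus [0, 1]$. Consider the function $g(x) := f(x/m)$,
which is zero on $\mathbb{R} \setminus [0, m]$. Then by Lemma 2.9 and upon dividing the identity stated
there by m we obtain

\[
  0 = \frac{1}{m} E\Bigl[ R\Bigl((m - R) + \frac{1}{2}\Bigr) \Delta g(R-1) + \Bigl(\frac{m}{2} - R\Bigr) g(R) \Bigr]
    = E\Bigl[ mW\Bigl((1 - W) + \frac{1}{2m}\Bigr) \Delta_{1/m} f\Bigl(W - \frac{1}{m}\Bigr)
      + \Bigl(\frac{1}{2} - W\Bigr) f(W) \Bigr] .
\]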



Inserting this into the Stein identity resulting from the Stein equation (1) we obtain

\begin{align*}
  E[h(W)] - \nu(h)
  &= E\Bigl[ W(1-W) f'(W) + \Bigl(\frac{1}{2} - W\Bigr) f(W) \Bigr] \\
  &= E\Bigl[ W(1-W) f'(W) - mW\Bigl((1-W) + \frac{1}{2m}\Bigr) \Delta_{1/m} f\Bigl(W - \frac{1}{m}\Bigr) \Bigr] \\
  &= E\Bigl[ W(1-W) f'(W) - mW(1-W) \Delta_{1/m} f\Bigl(W - \frac{1}{m}\Bigr) \Bigr] + E_1 \tag{9}
\end{align*}

with

\[
  |E_1| = \frac{1}{2} \Bigl| E\Bigl[ W \Delta_{1/m} f\Bigl(W - \frac{1}{m}\Bigr) \Bigr] \Bigr|
  \le \frac{C_1}{4m} \tag{10}
\]

by Lemma 2.2 (b) and by $E[W] = 1/2$, which follows for example from Lemma
2.4 (c) by symmetry. Using the fundamental theorem of calculus, we rewrite the
remaining term in (9) as



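\begin{align*}
  & E\Bigl[ W(1-W) f'(W) - mW(1-W) \Delta_{1/m} f\Bigl(W - \frac{1}{m}\Bigr) \Bigr] \\
  &\quad = E\Bigl[ W(1-W) \Bigl( f'(W) - m \int_{W - \frac{1}{m}}^{W} f'(t)\, dt \Bigr) \Bigr] \\
  &\quad = m E\Bigl[ \int_{W - \frac{1}{m}}^{W} W(1-W) \bigl( f'(W) - f'(t) \bigr)\, dt \Bigr] \\
  &\quad = m E\Bigl[ \int_{W - \frac{1}{m}}^{W} \bigl( W(1-W) f'(W) - t(1-t) f'(t) \bigr)\, dt \Bigr] + E_2 , \tag{11}
\end{align*}

where

\[
  E_2 := -m E\Bigl[ \int_{W - \frac{1}{m}}^{W} \bigl( W(1-W) - t(1-t) \bigr) f'(t)\, dt \Bigr]
\]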

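and hence, again by the fundamental theorem of calculus,

\begin{align*}
  |E_2| &= m \Bigl| E\Bigl[ \int_{W - \frac{1}{m}}^{W} f'(t) \int_t^W (1 - 2s)\, ds\, dt \Bigr] \Bigr| \\
  &\le m \Bigl(1 + \frac{2}{m}\Bigr) \|f'\|_\infty \, E\Bigl[ \int_{W - \frac{1}{m}}^{W} (W - t)\, dt \Bigr] \\
  &= (m + 2) \|f'\|_\infty \int_0^{\frac{1}{m}} u \, du
   = \frac{(m+2) \|f'\|_\infty}{2m^2} \le \frac{(m+2)\, C_1}{2m^2} , \tag{12}
\end{align*}

where we have used Lemma 2.2 for the last step and the inequality
$|1 - 2s| \le 1 + \frac{2}{m}$ for the relevant values of s for the first step.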

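It remains to deal with the first expectation in (11). Since $f = f_h$ solves the Stein
equation (1), and by the fundamental theorem of calculus for Lebesgue integration,
we obtain

\begin{align*}
  & m E\Bigl[ \int_{W - \frac{1}{m}}^{W} \bigl( W(1-W) f'(W) - t(1-t) f'(t) \bigr)\, dt \Bigr] \\
  &\quad = m E\Bigl[ \int_{W - \frac{1}{m}}^{W} \Bigl( h(W) - \nu(h) - \Bigl(\frac{1}{2} - W\Bigr) f(W)
      - h(t) + \nu(h) + \Bigl(\frac{1}{2} - t\Bigr) f(t) \Bigr)\, dt \Bigr] \\
  &\quad = m E\Bigl[ \int_{W - \frac{1}{m}}^{W} \Bigl( h(W) - h(t) + \Bigl(W - \frac{1}{2}\Bigr) f(W)
      - \Bigl(t - \frac{1}{2}\Bigr) f(t) \Bigr)\, dt \Bigr] \\
  &\quad = m E\Bigl[ \int_{W - \frac{1}{m}}^{W} \Bigl( \int_t^W h'(s)\, ds
      + \int_t^W \Bigl( f(s) + \Bigl(s - \frac{1}{2}\Bigr) f'(s) \Bigr)\, ds \Bigr)\, dt \Bigr] . \tag{13}
\end{align*}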

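The inner integrals from (13) are bounded separately. As to the first one,

\[
  \Bigl| m E\Bigl[ \int_{W - \frac{1}{m}}^{W} \int_t^W h'(s)\, ds\, dt \Bigr] \Bigr|
  \le m \|h'\|_\infty \, E\Bigl[ \int_{W - \frac{1}{m}}^{W} (W - t)\, dt \Bigr]
  = \frac{\|h'\|_\infty}{2m} \le \frac{1}{2m} . \tag{14}
\]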



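For the second one, since $|\frac{1}{2} - s| \le \frac{1}{m} + \frac{1}{2} \le \frac{3}{2}$ for the relevant
values of s, we have

\begin{align*}
  \Bigl| m E\Bigl[ \int_{W - \frac{1}{m}}^{W} \int_t^W \Bigl( f(s) + \Bigl(s - \frac{1}{2}\Bigr) f'(s) \Bigr)\, ds\, dt \Bigr] \Bigr|
  &\le m \Bigl( \|f\|_\infty + \frac{3}{2} \|f'\|_\infty \Bigr) E\Bigl[ \int_{W - \frac{1}{m}}^{W} \int_t^W ds\, dt \Bigr] \\
  &= m \Bigl( \|f\|_\infty + \frac{3}{2} \|f'\|_\infty \Bigr) E\Bigl[ \int_{W - \frac{1}{m}}^{W} (W - t)\, dt \Bigr] \\
  &= m \Bigl( \|f\|_\infty + \frac{3}{2} \|f'\|_\infty \Bigr) \frac{1}{2m^2} \\
  &\le \Bigl( 2 + \frac{3}{2} C_1 \Bigr) \frac{\|h'\|_\infty}{2m} \le \frac{4 + 3 C_1}{4m} , \tag{15}
\end{align*}

where we have used Lemma 2.2 for the next to last inequality. Since $h \in \mathrm{Lip}(1)$ was
arbitrary, the conclusion of the theorem follows from (10), (12), (14) and (15). □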